[320]
[320] 
[320] 



[ 330 340 350 360 370 380 390 400]

[ ........]

Bnef.mrca AGATATCCTT GATCTGTGGG TCTACCACAC ACAAGGCTAC TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGACCA [400]

Bnef.MMcot T. . [400]

Bnef.LScot T. . [400]



[ 410 420 430 440 450 460 470 480] 

[ ........] 

Bnef.mrca GATATCCACT GACCTTTGGA TGGTGCTTCA AGCTAGTACC AGTTGAGCCA GAGAAGGTAG AAGAGGCCAC TGAAGGAGAG [480]

Bnef.MMcot A [480]

Bnef.LScot A [480]



[ 490 500 510 520 530 540 550 560]

[ ........]

Bnef.mrca AACAACAGCT TGTTACACCC TATGAGCCTG CATGGAATGG ATGACCCGGA GAGAGAAGTG TTAGTGTGGA GGTTTGACAG [560]

Bnef.MMcot T A G A A [560]

Bnef.LScot G A A [560]

[ 570 580 590 600 610 620]

[ ......]

Bnef.mrca CCGCCTAGCA TTTCATCACA TGGCCCGAGA GAAGCATCCG GAGTACTACA AGGACTGCTG A [621]

Bnef.MMcot CT [621]

Bnef.LScot CT [621]



Figure 12 

Comparison of Clade B pol Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Bpol.mrca TTTTTTAGGG AAAATCTGGC CTTCCCACAA GGGAAGGCCA GGGAACTTTC TTCAGAGCAG ACCAGAGCCA ACAGCCCCAC [80] 

Bpol.LScot G T [80] 

Bpol.MMcot G T [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Bpol.mrca CAGAAGAGAG CTTCAGGTTT GGGGAAGAGA CAACAACTCC CTCTCAGAAG CAGGAGCAGA TAGACAAGGA ACTGTATCCT [160] 

Bpol.LScot C [160] 

Bpol.MMcot C [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Bpol.mrca TTAGCTTCCC TCAAATCACT CTTTGGCAAC GACCCCTCGT CACAATAAAG ATAGGGGGGC AACTAAAGGA AGCTCTATTA [240]

Bpol.LScot G [240] 

Bpol.MMcot G [240] 

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Bpol.mrca GATACAGGAG CAGATGATAC AGTATTAGAA GAAATGAATT TGCCAGGAAA ATGGAAACCA AAAATGATAG GGGGAATTGG [320] 

Bpol.LScot G [320] 

Bpol.MMcot G [320] 

[ 330 340 350 360 370 380 390 400] 

[ ........] 

Bpol.mrca AGGTTTTATC AAAGTAAGAC AGTATGATCA AATACCCATA GAAATCTGTG GACATAAAGC TATAGGTACA GTATTAGTAG [400]

Bpol.LScot G [400]

Bpol.MMcot G....T [400] 

[ 410 420 430 440 450 460 470 480] 

[ ........]

Bpol.mrca GACCTACACC TGTCAACATA ATTGGAAGAA ATCTGTTGAC TCAGATTGGT TGCACTTTAA ATTTTCCCAT TAGTCCTATT [480]

Bpol.LScot [480] 

Bpol.MMcot [480] 

[ 490 500 510 520 530 540 550 560] 

[ ........] 

Bpol.mrca GAAACTGTAC CAGTAAAATT AAAGCCAGGA ATGGATGGCC CAAAAGTTAA ACAATGGCCA TTGACAGAAG AAAAAATAAA [560]

Bpol.LScot [560] 

Bpol.MMcot [560] 

[ 570 580 590 600 610 620 630 640] 

[ ........] 

Bpol.mrca AGCATTAGTA GAAATTTGTA CAGAAATGGA AAAGGAAGGA AAAATTTCAA AAATTGGGCC TGAAAATCCA TACAATACTC [640] 

Bpol.LScot G [640]

Bpol.MMcot G [640] 

[ 650 660 670 680 690 700 710 720] 

[ ........] 

Bpol.mrca CAGTATTTGC CATAAAGAAA AAAGACAGTA CTAAATGGAG AAAATTAGTA GATTTCAGAG AACTTAATAA GAGAACTCAA [720] 

Bpol.LScot [720] 

Bpol.MMcot [720] 

[ 730 740 750 760 770 780 790 800] 

[ ........] 

Bpol.mrca GACTTCTGGG AAGTTCAATT AGGAATACCA CATCCTGCAG GGTTAAAAAA GAAAAAATCA GTAACAGTAC TGGATGTGGG [800] 

Bpol.LScot C [800] 

Bpol.MMcot C [800] 

[ 810 820 830 840 850 860 870 880] 

[ ........] 

Bpol.mrca TGATGCATAT TTTTCAGTTC CCTTAGATGA AGACTTCAGG AAGTATACTG CATTTACCAT ACCTAGTATA AACAATGAGA [880] 

Bpol.LScot [880] 

Bpol.MMcot [880] 



[ 890 900 910 920 930 940 950 960] 

[ ........] 

Bpol.mrca CACCAGGGAT TAGATATCAG TACAATGTGC TTCCACAGGG ATGGAAAGGA TCACCAGCAA TATTCCAAAG TAGCATGACA [960]

Bpol.LScot [960] 

Bpol.MMcot [960] 

[ 970 980 990 1000 1010 1020 1030 1040] 

[ ........] 

Bpol.mrca AAAATCTTAG AGCCTTTTAG AAAACAAAAT CCAGAAATAG TTATCTATCA ATACATGGAT GATTTGTATG TAGGATCTGA [1040]

Bpol.LScot C [1040] 

Bpol.MMcot C [1040] 

[ 1050 1060 1070 1080 1090 1100 1110 1120] 

[ ........] 

Bpol.mrca CTTAGAAATA GGGCAGCATA GAACAAAAAT AGAGGAACTG AGAGAACATC TGTTGAGGTG GGGATTTACC ACACCAGACA [1120]

Bpol.LScot C [1120] 

Bpol.MMcot C [1120] 

[ 1130 1140 1150 1160 1170 1180 1190 1200] 

[ ........] 

Bpol.mrca AAAAACATCA GAAAGAACCT CCATTTCTTT GGATGGGTTA TGAACTCCAT CCTGATAAAT GGACAGTACA GCCTATAGTG [1200] 

Bpol.LScot C [1200] 

Bpol.MMcot C [1200] 

[ 1210 1220 1230 1240 1250 1260 1270 1280] 

[ ........] 

Bpol.mrca CTGCCAGAAA AAGACAGCTG GACTGTCAAT GACATACAGA AGTTAGTGGG AAAATTGAAT TGGGCAAGTC AGATTTATGC [1280]

Bpol.LScot [1280]

Bpol.MMcot CC. [1280]

[ 1290 1300 1310 1320 1330 1340 1350 1360] 

[ ........] 

Bpol.mrca AGGGATTAAA GTAAAGCAAT TATGTAAACT CCTTAGGGGA ACCAAAGCAC TAACAGAAGT AGTACCACTA ACAGAAGAAG [1360] 

Bpol.LScot A [1360] 

Bpol.MMcot A [1360] 

[ 1370 1380 1390 1400 1410 1420 1430 1440] 

[ ........]

Bpol.mrca CAGAGCTAGA ACTGGCAGAA AACAGGGAGA TTCTAAAAGA ACCAGTACAT GGAGTGTATT ATGACCCATC AAAAGACTTA [1440] 

Bpol.LScot [1440] 

Bpol.MMcot A [1440] 

[ 1450 1460 1470 1480 1490 1500 1510 1520] 

[ ........] 

Bpol.mrca ATAGCAGAAA TACAGAAGCA GGGGCAAGGC CAATGGACAT ATCAAATTTA TCAAGAGCCA TTTAAAAATC TGAAAACAGG [1520]

Bpol.LScot [1520] 

Bpol.MMcot [1520] 

[ 1530 1540 1550 1560 1570 1580 1590 1600] 

[ ........] 

Bpol.mrca AAAGTATGCA AGAATGAGGG GTGCCCACAC TAATGATGTA AAACAATTAA CAGAGGCAGT GCAAAAAATA GCCACAGAAA [1600] 

Bpol.LScot [1600] 

Bpol.MMcot ...A [1600] 

[ 1610 1620 1630 1640 1650 1660 1670 1680] 

[ ........] 

Bpol.mrca GCATAGTAAT ATGGGGAAAG ACTCCTAAAT TTAAACTACC CATACAAAAG GAAACATGGG AAGCATGGTG GACAGAGTAT [1680] 

Bpol.LScot A [1680] 

Bpol.MMcot A A [1680] 

[ 1690 1700 1710 1720 1730 1740 1750 1760] 

[ ........] 

Bpol.mrca TGGCAAGCCA CCTGGATTCC TGAGTGGGAG TTTGTCAATA CCCCTCCCTT AGTAAAATTA TGGTACCAGT TAGAGAAAGA [1760] 

Bpol.LScot G [1760]

Bpol.MMcot G [1760] 

[ 1770 1780 1790 1800 1810 1820 1830 1840] 

[ ........] 

Bpol.mrca ACCCATAGTA GGAGCAGAAA CTTTCTATGT AGATGGGGCA GCTAATAGAG AGACTAAATT AGGAAAAGCA GGATATGTTA [1840]

Bpol.LScot G [1840]

Bpol.MMcot C. .G [1840] 



[ 1850 1860 1870 1880 1890 1900 1910 1920] 

[ ........] 

Bpol.mrca CTGACAGAGG AAGACAAAAA GTTGTCTCCC TAACTGACAC AACAAATCAG AAGACTGAGT TACAAGCAAT TCATCTAGCT [1920]

Bpol.LScot [1920] 

Bpol.MMcot ..A [1920] 

[ 1930 1940 1950 1960 1970 1980 1990 2000] 

[ ........] 

Bpol.mrca TTGCAGGATT CGGGATTAGA AGTAAACATA GTAACAGACT CACAATATGC ATTAGGAATC ATTCAAGCAC AACCAGATAA [2000] 

Bpol.LScot [2000] 

Bpol.MMcot [2000] 

[ 2010 2020 2030 2040 2050 2060 2070 2080] 

[ ........] 

Bpol.mrca GAGTGAATCA GAGTTAGTCA GTCAAATAAT AGAGCAGTTA ATAAAAAAGG AAAAGGTCTA CCTGGCATGG GTACCAGCAC [2080]

Bpol.LScot [2080] 

Bpol.MMcot A [2080] 

[ 2090 2100 2110 2120 2130 2140 2150 2160] 

[ ........] 

Bpol.mrca ACAAAGGAAT TGGAGGAAAT GAACAAGTAG ATAAATTAGT CAGTACTGGA ATCAGGAAAG TACTATTTTT GGATGGAATA [2160] 

Bpol.LScot G [2160] 

Bpol.MMcot G A [2160] 

[ 2170 2180 2190 2200 2210 2220 2230 2240] 

[ ........] 

Bpol.mrca GATAAGGCCC AAGAAGAACA TGAGAAATAT CACAGTAATT GGAGAGCAAT GGCTAGTGAT TTTAACCTGC CACCTGTAGT [2240]

Bpol.LScot [2240] 

Bpol.MMcot [2240] 

[ 2250 2260 2270 2280 2290 2300 2310 2320] 

[ ........] 

Bpol.mrca AGCAAAAGAA ATAGTAGCCA GCTGTGATAA ATGTCAGCTA AAAGGAGAAG CCATGCATGG ACAAGTAGAC TGTAGTCCAG [2320]

Bpol.LScot [2320] 

Bpol.MMcot [2320] 

[ 2330 2340 2350 2360 2370 2380 2390 2400] 

[ ........]

Bpol.mrca GAATATGGCA ACTAGATTGT ACACATTTAG AAGGAAAAGT TATCCTGGTA GCAGTTCATG TAGCCAGTGG CTATATAGAA [2400]

Bpol.LScot A [2400] 

Bpol.MMcot A [2400] 

[ 2410 2420 2430 2440 2450 2460 2470 2480] 

[ ........] 

Bpol.mrca GCAGAAGTTA TTCCAGCAGA AACAGGGCAG GAAACAGCAT ACTTTCTCTT AAAATTAGCA GGAAGATGGC CAGTAAAAGT [2480]

Bpol.LScot G AC [2480] 

Bpol.MMcot G AC [2480] 

[ 2490 2500 2510 2520 2530 2540 2550 2560] 

[ ........] 

Bpol.mrca AATACATACA GACAATGGCA GCAATTTCAC CAGTACTACA GTTAAGGCCG CCTGTTGGTG GGCAGGGATC AAGCAGGAAT [2560] 

Bpol.LScot i G [2560] 

Bpol.MMcot G G. [2560] 

[ 2570 2580 2590 2600 2610 2620 2630 2640] 

[ ........]

Bpol.mrca TTGGCATTCC CTACAATCCC CAAAGTCAAG GAGTAGTAGA ATCTATGAAT AAAGAATTAA AGAAAATTAT AGGACAGGTA [2640]

Bpol.LScot [2640] 

Bpol.MMcot [2640] 

[ 2650 2660 2670 2680 2690 2700 2710 2720] 

[ ........] 

Bpol.mrca AGAGATCAGG CTGAACATCT TAAGACAGCA GTACAAATGG CAGTATTCAT CCACAATTTT AAAAGAAAAG GGGGGATTGG [2720] 

Bpol.LScot [2720] 

Bpol.MMcot [2720] 



[ 2730 2740 2750 2760 2770 2780 2790 2800] 

[ ........] 

Bpol.mrca GGGGTACAGT GCAGGGGAAA GAATAGTAGA CATAATAGCA ACAGACATAC AAACTAAAGA ACTACAAAAA CAAATTACAA [2800] 

Bpol.LScot T [2800] 

Bpol.MMcot T [2800]

[ 2810 2820 2830 2840 2850 2860 2870 2880] 

[ ........] 

Bpol.mrca AAATTCAAAA TTTTCGGGTT TATTACAGGG ACAGCAGAGA TCCACTTTGG AAAGGACCAG CAAAGCTTCT CTGGAAAGGT [2880] 

Bpol.LScot [2880] 

Bpol.MMcot [2880] 

[ 2890 2900 2910 2920 2930 2940 2950 2960] 

[ ........] 

Bpol.mrca GAAGGGGCAG TAGTAATACA AGATAATAGT GACATAAAAG TAGTGCCAAG AAGAAAAGCA AAGATCATTA GGGATTATGG [2960] 

Bpol.LScot [2960] 

Bpol.MMcot [2960]

[ 2970 2980 2990 3000 3010] 

[ .....] 

Bpol.mrca AAAACAGATG GCAGGTGATG ATTGTGTGGC AAGTAGACAG GATGAGGATT AG [3012] 

Bpol.LScot [3012] 

Bpol.MMcot [3012]



Figure 13 

Comparison of Clade B rev Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Brev.mrca ATGGCAGGAA GAAGCGGAGA CAGCGACGAA GAGCTCCTCA AGACAGTCAG ACTCATCAAG TTTCTCTATC AAAGCAACCC [80]

Brev.LScot [80] 

Brev.MMcot [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Brev.mrca GCCTCCCAGC CCCGAGGGGA CCCGACAGGC CCGAAGGAAT AGAAGAAGAA GGTGGAGAGA GAGACAGAGA CAGATCCGTT [160] 

Brev.LScot C G. [160] 

Brev.MMcot C G. [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Brev.mrca CGATTAGTGA ACGGATTCTT AGCACTTATC TGGGACGATC TGCGGAGCCT GTGCCTCTTC AGCTACCACC GCTTGAGAGA [240]

Brev.LScot T T. . .C [240] 

Brev.MMcot T C [240] 

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Brev.mrca CTTACTCTTG ATTGTAGCGA GGATTGTGGA ACTTCTGGGA CGCAGGGGGT GGGAAGTCCT CAAATATTGG TGGAATCTCC [320]

Brev.LScot A [320] 

Brev.MMcot [320] 

[ 330 340 350 360] 

[ .] 

Brev.mrca TGCAGTATTG GAGTCAGGAA CTAAAGAATA GTGCTGTTAG [360] 

Brev.LScot .A [360] 

Brev.MMcot [360] 



Figure 14 

Comparison of Clade B tat Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Btat.mrca ATGGAGCCAG TAGATCCTAG ACTAGAGCCC TGGAAGCATC CAGGAAGTCA GCCTAAGACT GCTTGTACCA ATTGCTATTG [80]

Btat.LScot [80] 

Btat.MMcot [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Btat.mrca TAAAAAGTGT TGCTATCATT GCCAAGTTTG CTTCATAACA AAAGGCTTAG GCATCTCCTA TGGCAGGAAG AAGCGGAGAC [160] 

Btat.LScot T T [160] 

Btat.MMcot T T [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Btat.mrca AGCGACGAAG ACCTCCTCAA GGCAGTCAGA CTCATCAAGT TTCTCTATCA AAGCAACCCG CCTCCCAGCC CCGAGGGGAC [240]

Btat.LScot G A [240] 

Btat.MMcot G A [240] 

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Btat.mrca CCGACAGGCC CGAAGGAATC GAAGAAGAAG GTGGAGAGAG AGACAGAGAC AGATCCGGTC GATTAGTGAA TGGATTCTTA [320] 

Btat.LScot [320] 

Btat.MMcot G [320] 

[ ] 

[ ]

Btat.mrca G [321] 

Btat.LScot . [321] 

Btat.MMcot T [321] 



Figure 15 

Comparison of Clade B vif Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Bvif.mrca ATGGAAAACA GATGGCAGGT GATGATTGTG TGGCAAGTAG ACAGGATGAG GATTAGAACA TGGAAAAGTT TAGTAAAACA [80]

Bvif.LScot [80]

Bvif.MMcot [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Bvif.mrca CCATATGTAT ATTTCAAAGA AAGCTAAGGG ATGGTTTTAT AGACATCACT ATGAAAGCAC TCATCCAAGA ATAAGTTCAG [160]

Bvif.LScot G [160]

Bvif.MMcot G [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Bvif.mrca AAGTACACAT CCCACTAGGA GATGCTAGAT TGGTAATAAA AACATATTGG GGTCTGCATA CAGGAGAAAG AGAATGGCAT [240]

Bvif.LScot G C C [240]

Bvif.MMcot G C C [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Bvif.mrca TTGGGTCAGG GAGTCTCCAT AGAATGGAGG AAAAGGAGAT ATAGCACACA AGTAGACCCT GGCCTAGCAG ACCAACTAAT [320]

Bvif.LScot A A [320]

Bvif.MMcot A A [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Bvif.mrca TCATCTGTAT TATTTTGATT GTTTTTCAGA ATCTGCTATA AGAAATGCCA TATTAGGACA TATAGTTAGT CCTAGGTGTG [400]

Bvif.LScot C [400]

Bvif.MMcot C [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Bvif.mrca AATATCAAGC AGGACATAAC AAGGTAGGAT CTCTACAGTA CTTGGCACTA ACAGCATTAA TAACACCAAA AAAGATAAAG [480]

Bvif.LScot G [480]

Bvif.MMcot G [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Bvif.mrca CCACCTTTGC CTAGTGTTAG GAAACTGACA GAGGATAGAT GGAACAAGCC CCAGAAGACC AAGGGCCACA GAGGGAGCCA [560]

Bvif.LScot C [560]

Bvif.MMcot C [560]

[ 570]

[ .]

Bvif.mrca TACAATGAAT GGACACTAG [579]

Bvif.LScot [579]

Bvif.MMcot [579]



Figure 16 

Comparison of Clade B vpr Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........]

Bvpr.mrca ATGGAACAAG CCCCAGAAGA CCAAGGGCCA CAGAGGGAGC CATACAATGA ATGGACACTA GAGCTTTTAG AGGAGCTTAA [80] 

Bvpr.LScot [80] 

Bvpr.MMcot [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Bvpr.mrca GAGTGAAGCT GTTAGACATT TTCCTAGGCT ATGGCTCCAT AGCTTAGGAC AACATATCTA TGAAACTTAT GGGGATACCT [160] 

Bvpr.LScot A T. [160] 

Bvpr.MMcot A T. [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Bvpr.mrca GGGCAGGAGT GGAAGCTATA ATAAGAATTC TGCAACAACT GCTGTTTATT CATTTCAGAA TTGGGTGTCA ACATAGCAGA [240] 

Bvpr.LScot C G [240] 

Bvpr.MMcot C [240] 



[ 250 260 270 280 290] 

[ .....] 

Bvpr.mrca ATAGGCATTA CTCGACAGAG AAGAGCAAGA AATGGAGCCA GTAGATCCTA G [291] 

Bvpr.LScot G [291] 

Bvpr.MMcot G [291] 



Figure 17 

Comparison of Clade B vpu Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........]

Bvpu.mrca ATGCAACCTT TAGAAATATT AGCAATAGTA GCATTAGTAG TAGCAGCAAT ACTAGCAATA GTTGTGTGGA CCATAGTATT [80]

Bvpu.LScot C A [80]

Bvpu.MMcot C A [80]

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Bvpu.mrca CATAGAATAT AGGAAAATAT TAAGGCAAAG AAAAATAGAC AGGTTAATTG ATAGAATAAG AGAAAGAGCA GAAGACAGTG [160] 

Bvpu.LScot A [160] 

Bvpu.MMcot A [160]

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Bvpu.mrca GCAATGAGAG TGAAGGGGAT CAGGAAGAAT TATCAGCACT TGTGGAAATG GGGCACCATG CTCCTTGGGA TGTTGATGAT [240] 

Bvpu.LScot G [240] 

Bvpu.MMcot G [240]

[ ] 

[ ] 

Bvpu.mrca CTGTAG [246]

Bvpu.LScot [246]

Bvpu.MMcot [246]



Figure 18

Comparison of Clade B gag Protein Sequence Reconstructions



[ 10 20 30 40 50 60 70 80]

[ ........]

Bgag.mrca MGARASVLSG GELDKWEKIR LRPGGKKKYK LKHIVWASRE LERFAVNPGL LETSEGCRQI LGQLQPSLQT GSEELRSLYN [80]

Bgag.LScot R R [80]

Bgag.MMcot . . .G K . . R R. .E. .H K. . . . [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Bgag.mrca TVAVLYCVHQ KIEVKDTKEA LDKIEEEQNK SKKKAQQAAA DTGNSSQVSQ NYPIVQNLQG QMVHQALSPR TLNAWVKVIE [160]

Bgag.LScot . . .T R E I V. [160]

Bgag.MMcot ...T N...R...D. .E I..R NP M I V. [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Bgag.mrca EKAFSPEVIP MFSALSEGAT PQDLNTMLNT VGGHQAAMQM LKETINEEAA EWDRLHPVHA GPIAPGQMRE PRGSDIAGTT [240]

Bgag.LScot [240]

Bgag.MMcot [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Bgag.mrca STLQEQIAWM TNNPPIPVGE IYKRWIILGL NKIVRMYSPV SILDIRQGPK EPFRDYVDRF YKTLRAEQAS QEVKNWMTET [320]

Bgag.LScot G T [320]

Bgag.MMcot G. - ,H M T [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Bgag.mrca LLVQNANPDC KTILKALGPG ATLEEMMTAC QGVGGPGHKA RVLAEAMSQV TNSATIMMQR GNFRNPRKTV KCFNCGKEGH [400]

Bgag.LScot A Q [400]

Bgag.MMcot A S A KGQ [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Bgag.mrca IARNCRAPRK KGCWKCGKEG HQMKDCTERQ ANFLGKIWPS HKGRPGNFLQ SRPEPTAPPE ESFRFGEETT TPSQKQEQKD [480]

Bgag.LScot . . K PI. [480]

Bgag.MMcot P. . . .PR. [480]

[ 490 500]

[ ..]

Bgag.mrca KELYPLASLK SLFGNDPSSQ [500]

Bgag.LScot R [500]

Bgag.MMcot . . Q . . . T . . R [500]



Figure 19 

Comparison of Clade B gp160 Protein Sequence Reconstructions



[ 10 20 30 40 50 60 70 80]

[ ........]

Bgp160.mrca MRVKGIRKNC QHLWKWGTML LGMLMICSAA ENLWVTVYYG VPVWKEATTT LFCASDAKAY KTEVHNVWAT HACVPTDPNP [80]

Bgp160.LScot Y . . . . R K D [80]

Bgp160.MMcot Y . . . . R K D [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Bgp160.mrca QEVVLENVTE NFNMWKNNMV EQMHEDIISL WDQSLKPCVK LTPLCVTLNC TDANKNATNT NSSSGGTMEK GEMKNCSFNI [160]

Bgp160.LScot L EM I [160]

Bgp160.MMcot L EM I [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Bgp160.mrca TTSIRDKMQK EYALFYKLDV VPIDNDNNSN NNTNYRLINC NTSVITQACP KVSFEPIPIH YCTPAGFAIL KCNDKKFNGT [240]

Bgp160.LScot V T. .T.S....S A [240]

Bgp160.MMcot V T. .T.S....S A [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Bgp160.mrca GPCKNVSTVQ CTHGIRPVVS TQLLLNGSLA EEEVVIRSEN FTDNAKTIIV QLNESVEINC TRPNNNTRKS IPIGPGRALY [320]

Bgp160.LScot . . .T D H F. [320]

Bgp160.MMcot ...T D H F. [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Bgp160.mrca TTGEIIGDIR QAHCNISRAK WNNTLKQVV- -TKLREQFGN NKTIVFNPSS GGDPEIVMHS FNCGGEFFYC NTTQLFNSTW [398]

Bgp160.LScot I.- -K Q S [398]

Bgp160.MMcot I.- -K Q S [398]

[ 410 420 430 440 450 460 470 480]

[ ........]

Bgp160.mrca NSTEGSNKTT GSNNTGGETI TLPCRIKQII NMWQEVGKAM YAPPIRGQIK CSSNITGLLL TRDGGENSTN ETEIFRPGGG [478]

Bgp160.LScot .G.WTW.T.E . ..D.E.D R N.N [478]

Bgp160.MMcot .G.WTW.T.E ...D.E.D R N.N [478]

[ 490 500 510 520 530 540 550 560]

[ ........]

Bgp160.mrca DMRDNWRSEL YKYKVVKIEP LGVAPTKAKR RVVQREKRAV GIIGAMFLGF LGAAGSTMGA ASMTLTVQAR QLLSGIVQQQ [558]

Bgp160.LScot V [558]

Bgp160.MMcot V [558]

[ 570 580 590 600 610 620 630 640]

[ ........]

Bgp160.mrca NNLLRAIEAQ QHLLQLTVWG IKQLQARVLA VERYLRDQQL LGIWGCSGKL ICTTTVPWNA SWSNKSLDKI WNNMTWMEWE [638]

Bgp160.LScot A E [638]

Bgp160.MMcot A E [638]

[ 650 660 670 680 690 700 710 720]

[ ........]

Bgp160.mrca REIDNYTGLI YNLIEESQNQ QEKNEQELLE LDKWASLWNW FDITQWLWYI KIFIMIVGGL VGLRIVFAVL SIVNRVRQGY [718]

Bgp160.LScot S.. .T N [718]

Bgp160.MMcot S.. .T N [718]

[ 730 740 750 760 770 780 790 800]

[ ........]

Bgp160.mrca SPLSFQTRLP APRGPDRPEG IEEEGGERDR DRSGRLVNGF LALIWDDLRS LCLFSYHRLR DLLLIVARIV ELLGRRGWEA [798]

Bgp160.LScot T [798]

Bgp160.MMcot T [798]

[ 810 820 830 840 850 860 ]

[ ......]

Bgp160.mrca LKYWWNLLQY WSQELKNSAV SLLNATAIAV AEGTDRVIEV VQRACRAILH IPRRIRQGLE RALL [862]

Bgp160.LScot T [862]

Bgp160.MMcot T [862]



Figure 20 

Comparison of Clade B nef Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Bnef.mrca MGGKWSKRSV VGWPAVRERM RRAEPAADGV GAVSRDLEKH GAITSSNTAA TNAACAWLEA QEEEEVGFPV RPQVPLRPMT [80]

Bnef.MMcot D [80]

Bnef.LScot D [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Bnef.mrca YKAAVDLSHF LKEKGGLEGL VYSQKRQDIL DLWVYHTQGY FPDWQNYTPG PGTRYPLTFG WCFKLVPVEP EKVEEATEGE [160]

Bnef.MMcot . . . . L I I N. . . [160]

Bnef.LScot L I I N... [160]

[ 170 180 190 200 ]

[ ....]

Bnef.mrca NNSLLHPMSL HGMDDPEREV LVWRFDSRLA FHHMAREKHP EYYKDC [206]

Bnef.MMcot . .C Q K K L [206]

Bnef.LScot K K L [206]



Figure 21 

Comparison of Clade B pol Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........]

Bpol.mrca FFRENLAFPQ GKARELSSEQ TRANSPTRRE LQVWGRDNNS LSEAGADRQG TVSFSFPQIT LWQRPLVTIK IGGQLKEALL [80]

Bpol.LScot ....D F [80] 

Bpol.MMcot . . . .D F [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........]

Bpol.mrca DTGADDTVLE EMNLPGKWKP KMIGGIGGFI KVRQYDQIPI EICGHKAIGT VLVGPTPVNI IGRNLLTQIG CTLNFPISPI [160] 

Bpol.LScot R [160] 

Bpol.MMcot R L [160] 

[ 170 180 190 200 210 220 230 240]

[ ........]

Bpol.mrca ETVPVKLKPG MDGPKVKQWP LTEEKIKALV EICTEMEKEG KISKIGPENP YNTPVFAIKK KDSTKWRKLV DFRELNKRTQ [240]

Bpol.LScot [240] 

Bpol.MMcot [240] 

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Bpol.mrca DFWEVQLGIP HPAGLKKKKS VTVLDVGDAY FSVPLDEDFR KYTAFTIPSI NNETPGIRYQ YNVLPQGWKG SPAIFQSSMT [320]

Bpol.LScot [320] 

Bpol.MMcot [320] 

[ 330 340 350 360 370 380 390 400] 

[ ........] 

Bpol.mrca KILEPFRKQN PEIVIYQYMD DLYVGSDLEI GQHRTKIEEL REHLLRWGFT TPDKKHQKEP PFLWMGYELH PDKWTVQPIV [400] 

Bpol.LScot D Q [400] 

Bpol.MMcot D Q [400] 

[ 410 420 430 440 450 460 470 480] 

[ ........]

Bpol.mrca LPEKDSWTVN DIQKLVGKLN WASQIYAGIK VKQLCKLLRG TKALTEVVPL TEEAELELAE NREILKEPVH GVYYDPSKDL [480]

Bpol.LScot I [480]

Bpol.MMcot P I [480] 

[ 490 500 510 520 530 540 550 560] 

[ ........]

Bpol.mrca IAEIQKQGQG QWTYQIYQEP FKNLKTGKYA RMRGAHTNDV KQLTEAVQKI ATESIVIWGK TPKFKLPIQK ETWEAWWTEY [560] 

Bpol.LScot [560]

Bpol.MMcot T [560] 

[ 570 580 590 600 610 620 630 640] 

[ ........] 

Bpol.mrca WQATWIPEWE FVNTPPLVKL WYQLEKEPIV GAETFYVDGA ANRETKLGKA GYVTDRGRQK VVSLTDTTNQ KTELQAIHLA [640]

Bpol.LScot [640]

Bpol.MMcot N [640] 

[ 650 660 670 680 690 700 710 720] 

[ ........]

Bpol.mrca LQDSGLEVNI VTDSQYALGI IQAQPDKSES ELVSQIIEQL IKKEKVYLAW VPAHKGIGGN EQVDKLVSTG IRKVLFLDGI [720]

Bpol.LScot A [720] 

Bpol.MMcot A [720] 

[ 730 740 750 760 770 780 790 800] 

[ ........]

Bpol.mrca DKAQEEHEKY HSNWRAMASD FNLPPVVAKE IVASCDKCQL KGEAMHGQVD CSPGIWQLDC THLEGKVILV AVHVASGYIE [800]

Bpol.LScot [800]

Bpol.MMcot [800]

[ 810 820 830 840 850 860 870 880] 

[ ........] 

Bpol.mrca AEVIPAETGQ ETAYFLLKLA GRWPVKVIHT DNGSNFTSTT VKAACWWAGI KQEFGIPYNP QSQGVVESMN KELKKIIGQV [880]

Bpol.LScot T [880] 

Bpol.MMcot T [880] 



[ 890 900 910 920 930 940 950 960] 

[ ........] 

Bpol.mrca RDQAEHLKTA VQMAVFIHNF KRKGGIGGYS AGERIVDIIA TDIQTKELQK QITKIQNFRV YYRDSRDPLW KGPAKLLWKG [960]

Bpol.LScot [960] 

Bpol.MMcot [960] 

[ 970 980 990 1000 ] 

[ ] 

Bpol.mrca EGAVVIQDNS DIKVVPRRKA KIIRDYGKQM AGDDCVASRQ DED [1003]

Bpol.LScot [1003] 

Bpol.MMcot [1003] 



Figure 22 

Comparison of Clade B rev Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Brev.mrca MAGRSGDSDE ELLKTVRLIK FLYQSNPPPS PEGTRQARRN RRRRWRERQR QIRSISERIL STYLGRSAEP VPLQLPPLER [80] 

Brev.LScot W P [80] 

Brev.MMcot W P [80] 

[ 90 100 110 ] 

[ ] 

Brev.mrca LTLDCSEDCG TSGTQGVGSP QILVESPAVL ESGTKE [116] 

Brev.LScot N T [116] 

Brev.MMcot [116] 



Figure 23 

Comparison of Clade B tat Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Btat.mrca MEPVDPRLEP WKHPGSQPKT ACTNCYCKKC CYHCQVCFIT KGLGISYGRK KRRQRRRPPQ GSQTHQVSLS KQPASQPRGD [80]

Btat.LScot F A. . D [80]

Btat.MMcot F A. . D [80]

[ 90 100]

[ ..]

Btat.mrca PTGPKESKKK VERETETDPV D [101]

Btat.LScot [101]

Btat.MMcot [101]



Figure 24 

Comparison of Clade B vif Protein Sequence Reconstructions 
[ 10 20 30 40 50 60 70 80]

[ ........]

Bvif.mrca MENRWQVMIV WQVDRMRIRT WKSLVKHHMY ISKKAKGWFY RHHYESTHPR ISSEVHIPLG DARLVIKTYW GLHTGEREWH [80]

Bvif.LScot R T D. . [80]

Bvif.MMcot R T D.. [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Bvif.mrca LGQGVSIEWR KRRYSTQVDP GLADQLIHLY YFDCFSESAI RNAILGHIVS PRCEYQAGHN KVGSLQYLAL TALITPKKIK [160]

Bvif.LScot K D A [160]

Bvif.MMcot K D A [160]

[ 170 180 190 ]

[ ...]

Bvif.mrca PPLPSVRKLT EDRWNKPQKT KGHRGSHTMN GH [192]

Bvif.LScot T [192]

Bvif.MMcot T [192]



Figure 25 

Comparison of Clade B vpr Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Bvpr.mrca MEQAPEDQGP QREPYNEWTL ELLEELKSEA VRHFPRLWLH SLGQHIYETY GDTWAGVEAI IRILQQLLFI HFRIGCQHSR [80]

Bvpr.LScot I R. . . [80] 

Bvpr.MMcot I [80] 

[ 90 ] 

[ . ] 

Bvpr.mrca IGITRQRRAR NGASRS [96] 

Bvpr.LScot [96] 

Bvpr.MMcot [96] 



Figure 26

Comparison of Clade B vpu Protein Sequence Reconstructions 

[ 10 20 30 40 50 60 70 80]

[ ........]

Bvpu.mrca MQPLEILAIV ALVVAAILAI VVWTIVFIEY RKILRQRKID RLIDRIRERA EDSGNESEGD QEELSALVEM GHHAPWDVDD [80]

Bvpu.LScot Q I [80]

Bvpu.MMcot Q I [80]



[81] 
[81] 
[81] 



Figure 27 

Comparison of Clade C gag Gene Sequence Reconstructions 

[ 10 20 30 40 50 60 70 80] 

[ ........] 

Cgag.mrca ATGGGTGCGA GAGCGTCAAT ATTAAGAGGG GGAAAATTAG ATACATGGGA AAAAATTAGG TTAAGGCCAG GGGGAAAGAA [80] 

Cgag.LScot C [80] 

Cgag.MMcot C [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Cgag.mrca ACATTATATG ATAAAACACC TAGTATGGGC AAGCAGGGAG CTGGAAAGAT TTGCACTTAA CCCTGGCCTT TTAGAGACAT [160]

Cgag.LScot C [160]

Cgag.MMcot C [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........]

Cgag.mrca CAGAAGGCTG TAAACAAATA ATAAAACAGC TACAACCAGC TCTTCAGACA GGAACAGAGG AACTTAAATC ATTATATAAC [240]

Cgag.LScot G G [240]

Cgag.MMcot G G [240]

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Cgag.mrca ACAGTAGCAA CTCTCTATTG TGTACATCAA AGGATAGAGG TACGAGACAC CAAGGAAGCC TTAGACAAGA TAGAGGAAGA [320] 

Cgag.LScot G . . .A [320] 

Cgag.MMcot G. . .A [320] 

[ 330 340 350 360 370 380 390 400] 

[ ........]

Cgag.mrca ACAAAACAAA AGTGAGCAAA AAACACAGCA GGCAGAAGCG GCTGACG--- GAAAGGTCAG TCAAAATTAT CCTATAGTGC [397]

Cgag.LScot --- [397]

Cgag.MMcot GCT [400]

[ 410 420 430 440 450 460 470 480] 

[ ........]

Cgag.mrca AGAATCTCCA AGGGCAAATG GTACACCAGG CCATATCACC TAGAACTTTG AATGCATGGG TAAAAGTAAT AGAGGAGAAG [477]

Cgag.LScot [477]

Cgag.MMcot [480]

[ 490 500 510 520 530 540 550 560] 

[ ........]

Cgag.mrca GCTTTCAGCC CAGAGGTAAT ACCCATGTTT ACAGCATTAT CAGAAGGAGC CACCCCACAA GATTTAAACA CCATGTTAAA [557] 

Cgag.LScot [557] 

Cgag.MMcot [560] 

[ 570 580 590 600 610 620 630 640] 

[ ........]

Cgag.mrca TACAGTGGGG GGACATCAAG CAGCCATGCA AATGTTAAAA GATACCATCA ATGAGGAGGC TGCAGAATGG GATAGGTTAC [637] 

Cgag.LScot [637] 

Cgag.MMcot [640] 

[ 650 660 670 680 690 700 710 720] 

[ ........]

Cgag.mrca ATCCAGTGCA TGCAGGGCCT GTTGCACCAG GCCAAATGAG AGAACCAAGG GGAAGTGACA TAGCAGGAAC TACTAGTACC [717] 

Cgag.LScot A [717] 

Cgag.MMcot A [720] 

[ 730 740 750 760 770 780 790 800] 

[ ........] 

Cgag.mrca CTTCAGGAAC AAATAGCATG GATGACAAGT AACCCACCTA TCCCAGTGGG AGACATCTAT AAAAGATGGA TAATTCTGGG [797] 

Cgag.LScot G .T [797]

Cgag.MMcot G .T [800] 

[ 810 820 830 840 850 860 870 880] 

[ ........]

Cgag.mrca GTTAAATAAA ATAGTAAGAA TGTATAGCCC TGTCAGCATT TTGGACATAA AACAAGGGCC AAAGGAACCC TTTAGAGACT [877]

Cgag.LScot [877] 

Cgag.MMcot [880] 



[ 890 900 910 920 930 940 950 960] 

[ ........] 

Cgag.mrca ATGTAGACCG GTTCTTTAAA ACTTTAAGAG CTGAACAAGC TACACAAGAT GTAAAAAATT GGATGACAGA CACCTTGTTG [957]

Cgag.LScot [957] 

Cgag.MMcot [960] 

[ 970 980 990 1000 1010 1020 1030 1040] 

[ ........] 

Cgag.mrca GTCCAAAATG CGAACCCAGA TTGTAAGACC ATTTTAAGAG CATTAGGACC AGGGGCTACA CTAGAAGAAA TGATGACAGC [1037]

Cgag.LScot T [1037] 

Cgag.MMcot T [1040] 

[ 1050 1060 1070 1080 1090 1100 1110 1120] 

[ ........] 

Cgag.mrca ATGTCAGGGA GTGGGAGGAC CTAGCCATAA AGCAAGAGTT TTGGCTGAGG CAATGAGCCA AGCAAACAAT ACAAACATAA [1117]

Cgag.LScot G....C G [1117] 

Cgag.MMcot G....C G [1120] 

[ 1130 1140 1150 1160 1170 1180 1190 1200] 

[ ........] 

Cgag.mrca TGATGCAGAG AGGCAATTTT AAGGGCCCTA GAAGAATTGT TAAATGTTTC AACTGTGGCA AGGAAGGACA CATAGCCAGA [1197]

Cgag.LScot A A A G [1197] 

Cgag.MMcot A A A G [1200] 

[ 1210 1220 1230 1240 1250 1260 1270 1280] 

[ ........] 

Cgag.mrca AATTGCAGGG CCCCTAGGAA AAAGGGCTGT TGGAAATGTG GAAAGGAAGG ACACCAAATG AAAGACTGTA CTGAGAGGCA [1277]

Cgag.LScot A [1277] 

Cgag.MMcot A [1280] 

[ 1290 1300 1310 1320 1330 1340 1350 1360] 

[ ........] 

Cgag.mrca GGCTAATTTT TTAGGGAAAA TTTGGCCTTC CCACAAGGGG AGGCCAGGGA ATTTCCTTCA GAGCAGACCA GAGCCAACAG [1357]

Cgag.LScot [1357] 

Cgag.MMcot [1360] 

[ 1370 1380 1390 1400 1410 1420 1430 1440] 

[ ........] 

Cgag.mrca CCCCACCAGC AGAGAGCTTC AGGTTCGAGG AGACAACCCC CGCTCCGAAG CAGGAGCCGA AAGACAGGGA ACCCTTAACT [1437]

Cgag.LScot [1437] 

Cgag.MMcot [1440] 

[ 1450 1460 1470 1480] 

[ . ] 

Cgag.mrca TCCCTCAAAT CACTCTTTGG CAGCGACCCC TTGTCTCAAT AA [1479] 

Cgag.LScot [1479]

Cgag.MMcot *. . .. [1482] 



Figure 28

Comparison of Clade C env Gene Sequence Reconstructions

[ 10 20 30 40 50 60 70 80]

[ ........]

Cgp160.mrca ATGAGAGTGA TGGGGATACA GAGGAATTGT CAACAATGGT GGATATGGGG CATCTTAGGC TTTTGGATGT TAATGATTTG [80]

Cgp160.LScot G T [80]

Cgp160.MMcot G T [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cgp160.mrca TAGTGTGGTG GGGAACTTGT GGGTCACAGT CTATTATGGG GTACCTGTGT GGAAAGAAGC AAAAACTACT CTATTTTGTG [160]

Cgp160.LScot . .A C [160]

Cgp160.MMcot ..A C [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cgp160.mrca CATCAGATGC TAAAGCATAT GAGAGAGAAG TGCATAATGT CTGGGCTACA CATGCCTGTG TACCCACAGA CCCCAACCCA [240]

Cgp160.LScot A [240]

Cgp160.MMcot A [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cgp160.mrca CAAGAAATGG TTTTGGAAAA TGTAACAGAA AATTTTAACA TGTGGAAAAA TGACATGGTG GATCAGATGC ATGAGGATAT [320]

Cgp160.LScot [320]

Cgp160.MMcot [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cgp160.mrca AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAG TTGACCCCAC TCTGTGTCAC TTTAAACTGT ACTAATGTTA [400]

Cgp160.LScot T. . . .G [400]

Cgp160.MMcot T. . . .G [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cgp160.mrca ATAATACTAA TAATACCAAT AGTACCATGA ATGGAGAAAT GAAAAATTGC TCTTTCAATA TAACCACAGA AATAAGAGAT [480]

Cgp160.LScot . .GC...C. ..C A G A G C [480]

Cgp160.MMcot ...C...C. ..C A A A G C [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Cgp160.mrca AAGAAGAAGA AAGAATATGC ACTTTTTTAT AGACTTGATA TAGTACCACT TAATGAAAAT AATAACAATA CTAGTGAATA [560]

Cgp160.LScot AC TG G G..-.T G. . [560]

Cgp160.MMcot A G G T G. . [560]

[ 570 580 590 600 610 620 630 640]

[ ........]

Cgp160.mrca TAGATTAATA AATTGTAATA CCTCAGCCAT AACACAAGCC TGTCCAAAGG TCTCTTTTGA CCCAATTCCT ATACATTATT [640]

Cgp160.LScot [640]

Cgp160.MMcot [640]

[ 650 660 670 680 690 700 710 720]

[ ........]

Cgp160.mrca GTGCTCCAGC TGGTTATGCG ATTCTAAAGT GTAATAATAA GACATTCAAT GGAACAGGAC CATGCAAAAA TGTCAGCACA [720]

Cgp160.LScot T [720]

Cgp160.MMcot T [720]

[ 730 740 750 760 770 780 790 800]

[ ........]

Cgp160.mrca GTACAATGTA CACATGGAAT TAAGCCAGTG GTATCAACTC AACTACTGTT AAATGGTAGT CTAGCAGAAG AAGAGATAAT [800]

Cgp160.LScot C [800]

Cgp160.MMcot C [800]

[ 810 820 830 840 850 860 870 880]

[ ........]

Cgp160.mrca AATTAGATCT GAAAATCTGA CAAACAATGC CAAAACAATA ATAGTACAGC TTAATGAATC TGTAGAAATT GTGTGTACAA [880]

Cgp160.LScot T T [880]

Cgp160.MMcot T [880]



[ 890 900 910 920 930 940 950 960]

[ ........]

Cgp160.mrca GACCCAACAA TAATACAAGA AAAAGTATGA GGATAGGACC AGGACAAACA TTCTATGCAA CAGGAGACAT AATAGGAGAT [960]

Cgp160.LScot A C [960]

Cgp160.MMcot A C [960]

[ 970 980 990 1000 1010 1020 1030 1040]

[ ........]

Cgp160.mrca ATAAGACAAG CACATTGTAA CATTAGTGGA AGGGAATGGA ATAACACTTT ACAACAGGTA GCTGAAAAAT TAAGAAAACA [1040]

Cgp160.LScot A. GA A AG G . A GA.G [1040]

Cgp160.MMcot A. GA A AG G.A GA.G.... [1040]

[ 1050 1060 1070 1080 1090 1100 1110 1120]

[ ........]

Cgp160.mrca CTTCCCTAAT AAAACAATAA AATTTGCACC ATCCTCAGGA GGGGACCTAG AAATTACAAC ACATAGCTTT AATTGTAGAG [1120]

Cgp160.LScot A [1120]

Cgp160.MMcot A [1120]

[ 1130 1140 1150 1160 1170 1180 1190 1200]

[ ........]

Cgp160.mrca GAGAATTTTT CTATTGCAAT ACATCAAAAC TGTTTAATAG TACATACAAT AGTACAAATA GTACAAATTC AACCATCACA [1200]

Cgp160.LScot G A [1200]

Cgp160.MMcot G [1200]

[ 1210 1220 1230 1240 1250 1260 1270 1280]

[ ........]

Cgp160.mrca CTCCCATGCA GAATAAAACA AATTATAAAC ATGTGGCAGG GGGTAGGACA AGCAATGTAT GCCCCTCCCA TTGCAGGAAA [1280]

Cgp160.LScot A G [1280]

Cgp160.MMcot A A G [1280]

[ 1290 1300 1310 1320 1330 1340 1350 1360]

[ ........]

Cgp160.mrca CATAACATGT AAATCAAATA TCACAGGACT ACTATTGACA CGTGATGGAG GAAAAAATGA AACTAATGAA ACTGAGACAT [1360]

Cgp160.LScot GT A. C..A...A.C ..A....T.. [1360]

Cgp160.MMcot GT C..A C ..A....T.. [1360]

[ 1370 1380 1390 1400 1410 1420 1430 1440]

[ ........]

Cgp160.mrca TCAGACCTGG AGGAGGAGAT ATGAGGGACA ATTGGAGAAG TGAATTATAT AAATATAAAG TAGTAGAAAT TAAACCATTA [1440]

Cgp160.LScot G G G [1440]

Cgp160.MMcot G G G [1440]

[ 1450 1460 1470 1480 1490 1500 1510 1520]

[ ........]

Cgp160.mrca GGAGTAGCAC CCACTAAGGC AAAAAGGAGA GTGGTGGAGA GAGAAAAAAG AGCAGTGGGA CTAGGAGCTG TGTTCCTTGG [1520]

Cgp160.LScot . . .A A [1520]

Cgp160.MMcot . . .A A [1520]

[ 1530 1540 1550 1560 1570 1580 1590 1600]

[ ........]

Cgp160.mrca GTTCTTGGGA GCAGCAGGAA GCACTATGGG CGCAGCGTCA ATAACGCTGA CGGTACAGGC CAGACAATTA TTGTCTGGTA [1600]

Cgp160.LScot G G [1600]

Cgp160.MMcot G [1600]

[ 1610 1620 1630 1640 1650 1660 1670 1680]

[ ........]

Cgp160.mrca TAGTGCAACA GCAAAGCAAT TTGCTGAGGG CTATAGAGGC GCAACAGCAT ATGTTGCAAC TCACAGTCTG GGGCATTAAG [1680]

Cgp160.LScot G [1680]

Cgp160.MMcot G [1680]

[ 1690 1700 1710 1720 1730 1740 1750 1760]

[ ........]

Cgp160.mrca CAGCTCCAGG CAAGAGTCCT GGCTATGGAA AGATACCTAA AGGATCAACA GCTCCTAGGG ATTTGGGGCT GCTCTGGAAA [1760]

Cgp160.LScot A A [1760]

Cgp160.MMcot A A [1760]



[ 1770 1780 1790 1800 1810 1820 1830 1840]

[ ........]

Cgp160.mrca ACTCATCTGC ACCACTGCTG TGCCTTGGAA CTCTAGTTGG AGTAATAAAT CTCAAGATGA TATTTGGGAT AACATGACCT [1840]

Cgp160.LScot A [1840]

Cgp160.MMcot G [1840]

[ 1850 1860 1870 1880 1890 1900 1910 1920]

[ ........]

Cgp160.mrca GGATGGAGTG GGATAGAGAA ATTAACAATT ACACAGACAC AATATACAGG TTGCTTGAAG AATCGCAAAA CCAGCAGGAA [1920]

Cgp160.LScot C GT C [1920]

Cgp160.MMcot C GT C [1920]

[ 1930 1940 1950 1960 1970 1980 1990 2000]

[ ........]

Cgp160.mrca AAAAATGAAC AAGATTTATT GGCATTGGAC AGTTGGGAAA ATCTGTGGAA TTGGTTTGAC ATATCAAATT GGCTGTGGTA [2000]

Cgp160.LScot C A C. A A A [2000]

Cgp160.MMcot .A . . . .C. A. . [2000]

[ 2010 2020 2030 2040 2050 2060 2070 2080]

[ ........]

Cgp160.mrca TATAAAAATA TTCATAATGA TAGTAGGAGG CTTGATAGGT TTAAGAATAA TTTTTGCTGT GCTTTCTATA GTAAATAGAG [2080]

Cgp160.LScot G [2080]

Cgp160.MMcot G [2080]

[ 2090 2100 2110 2120 2130 2140 2150 2160]

[ ........]

Cgp160.mrca TTAGGCAGGG ATACTCACCT TTGTCGTTTC AGACCCTTAC CCCAAACCCG AGGGGACCCG ACAGGCTCGA AAGAATCGAA [2160]

Cgp160.LScot G [2160]

Cgp160.MMcot G [2160]

[ 2170 2180 2190 2200 2210 2220 2230 2240]

[ ........]

Cgp160.mrca GAAGAAGGTG GAGAGCAAGA CAGAGACAGA TCCATTCGAT TAGTGAGCGG ATTCTTAGCA CTTGCCTGGG ACGACCTGCG [2240]

Cgp160.LScot [2240]

Cgp160.MMcot [2240]

[ 2250 2260 2270 2280 2290 2300 2310 2320]

[ ........]

Cgp160.mrca GAGCCTGTGC CTCTTCAGCT ACCACCGCTT GAGAGACTTC ATCTTGATTG CAGCGAGGAC TGTGGAACTT CTGGGACGCA [2320]

Cgp160.LScot A A. . .G.G AG. G [2320]

Cgp160.MMcot A A. . .G.G AG. G [2320]

[ 2330 2340 2350 2360 2370 2380 2390 2400]

[ ........]

Cgp160.mrca GCAGTCTCAG GGGACTACAG AGGGGGTGGG AAGCCCTTAA ATATCTGGGA AGTCTTGTGC AGTATTGGGG TCAGGAGCTA [2400]

Cgp160.LScot G T [2400]

Cgp160.MMcot G T [2400]

[ 2410 2420 2430 2440 2450 2460 2470 2480]

[ ........]

Cgp160.mrca AAAAAGAGTG CTATTAGTCT GCTTGATACC ATAGCAATAG CAGTAGCTGA AGGGACAGAT AGGATTATAG AAGTAGTACA [2480]

Cgp160.LScot A T. .A. . . . [2480]

Cgp160.MMcot A T..A [2480]

[ 2490 2500 2510 2520 2530 2540 2550]

[ .......]

Cgp160.mrca AAGAGCTTGT AGAGCTATCC TCAACATACC TAGAAGAATA AGACAGGGCT TTGAAGCAGC TTTGCAATAA [2550]

Cgp160.LScot AT G [2550]

Cgp160.MMcot AT G [2550]



Figure 29 

Comparison of Clade C nef Gene Sequence Reconstructions 

[ 10 20 30 40 50 60 70 80]

[ ........]

Cnef.mrca ATGGGGGGCA AGTGGTCAAA AAGCAGTATA GTTGGATGGC CTGCTGTAAG AGAAAGAATA AGACGAACTG CTCCAGCAGC [80]

Cnef.LScot AG [80]

Cnef.MMcot AG [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cnef.mrca AGAAGGAGTA GGAGCAGCGT CTCAAGACTT AGATAAACAT GGAGCACTTA CAAGCAGCAA CACAGCCGCC ACTAATGCTG [160]

Cnef.LScot . . .G A [160]

Cnef.MMcot . . .G A [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cnef.mrca ATTGTGCCTG GCTGGAAGCA CAAGAGGAGG AAGAAG---T AGGCTTTCCA GTCAGACCTC AGGTGCCTTT AAGACCAATG [237]

Cnef.LScot AAG [240]

Cnef.MMcot AAG [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cnef.mrca ACTTATAAGG GAGCAGTCGA TCTCAGCTTC TTTTTAAAAG AAAAGGGGGG ACTGGAAGGG TTAATTTACT CTAAGAAAAG [317]

Cnef.LScot T [320]

Cnef.MMcot T [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cnef.mrca GCAAGAGATC CTTGATTTGT GGGTCTATCA CACACAAGGC TACTTCCCTG ATTGGCAAAA CTACACACCG GGACCAGGGA [397]

Cnef.LScot G [400]

Cnef.MMcot G [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cnef.mrca TCAGATTTCC ACTGACCTTT GGATGGTGCT TCAAGCTAGT GCCAGTTGAC CCAAGGGAAG TAGAAGAGGC CAATGAAGGA [477]

Cnef.LScot A C [480]

Cnef.MMcot A C [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Cnef.mrca GAGAACAACT GCTTGCTACA CCCTATGAGC CAGCATGGAA TGGAGGATGA AGACAGAGAA GTATTAAAGT GGAAGTTTGA [557]

Cnef.LScot T [560]

Cnef.MMcot T [560]

[ 570 580 590 600 610 620 ]

[ ......]

Cnef.mrca CAGTCACCTA GCACGCAGAC ACATGGCCCG CGAGCTACAT CCGGAGTATT ACAAAGACTG CTGA [621]

Cnef.LScot [624]

Cnef.MMcot [624]






Figure 30 

Comparison of Clade C pol Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Cpol.mrca TTTTTTAGGG AAAATTTGGC CTTCCCACAA GGGGAGGCCA GGGAATTTCC TTCAGAGCAG ACCAGAGCCA ACAGCCCCAC [80]

Cpol.LScot [80]

Cpol.MMcot [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cpol.mrca CAGCAGAGAG CTTCAGGTTC GAGGAGACAA CCCCCGCTCC GAAGCAGGAG CCGAAAGACA GGGAACCCTT AACTTCCCTC [160]

Cpol.LScot [160]

Cpol.MMcot T [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cpol.mrca AAATCACTCT TTGGCAGCGA CCCCTTGTCT CAATAAAAGT AGGGGGCCAG ATAAAGGAAG CTCTATTAGA TACAGGAGCA [240]

Cpol.LScot G C C [240]

Cpol.MMcot A A. . . C G C C [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cpol.mrca GATGATACAG TATTAGAAGA CATAAATTTG CCAGGAAAAT GGAAACCAAA AATGATAGGG GGAATTGGAG GTTTTATCAA [320]

Cpol.LScot A A [320]

Cpol.MMcot A A [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cpol.mrca AGTAAGACAG TATGATCAAA TACTTATAGA AATTTGTGGA AAAAAGGCTA TAGGTACAGT ATTAGTAGGA CCTACACCTG [400]

Cpol.LScot [400]

Cpol.MMcot C [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cpol.mrca TCAACATAAT TGGAAGAAAT ATGTTGACTC AGCTTGGTTG CACTCTAAAT TTTCCAATTA GTCCTATTGA AACTGTACCA [480]

Cpol.LScot A A C [480]

Cpol.MMcot A A C [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Cpol.mrca GTAAAATTAA AGCCAGGAAT GGATGGCCCA AAGGTTAAAC AATGGCCATT GACAGAAGAG AAAATAAAAG CATTAACAGC [560]

Cpol.LScot [560]

Cpol.MMcot C [560]

[ 570 580 590 600 610 620 630 640]

[ ........]

Cpol.mrca AATTTGTGAA GAAATGGAAA AGGAAGGAAA AATTACAAAA ATTGGGCCTG AAAATCCATA TAACACTCCA GTATTTGCCA [640]

Cpol.LScot G [640]

Cpol.MMcot G [640]

[ 650 660 670 680 690 700 710 720]

[ ........]

Cpol.mrca TAAAAAAGAA GGACAGTACT AAGTGGAGAA AATTAGTAGA TTTCAGAGAA CTCAATAAAA GAACTCAAGA CTTCTGGGAA [720]

Cpol.LScot G T [720]

Cpol.MMcot G T [720]

[ 730 740 750 760 770 780 790 800]

[ ........]

Cpol.mrca GTTCAATTAG GAATACCACA CCCAGCAGGG TTAAAAAAGA AAAAATCAGT AACAGTACTG GATGTGGGGG ATGCATATTT [800]

Cpol.LScot G [800]

Cpol.MMcot G G [800]

[ 810 820 830 840 850 860 870 880]

[ ........]

Cpol.mrca TTCAGTTCCT TTAGATGAAG ACTTCAGGAA ATATACTGCA TTCACCATAC CTAGTATAAA CAATGAAACA CCAGGGATTA [880]

Cpol.LScot G [880]

Cpol.MMcot G [880]



[ 890 900 910 920 930 940 950 960] 

[ ........]

Cpol.mrca GATATCAATA TAATGTGCTT CCACAGGGAT GGAAAGGATC ACCAGCAATA TTCCAGAGTA GCATGACAAA AATCTTAGAG [960]

Cpol.LScot [960]

Cpol.MMcot [960]

[ 970 980 990 1000 1010 1020 1030 1040]

[ ........]

Cpol.mrca CCCTTTAGGG CACAAAACCC AGAAATAGTT ATCTATCAAT ACATGGATGA CTTGTATGTA GGATCTGACT TAGAAATAGG [1040]

Cpol.LScot T C T [1040]

Cpol.MMcot T T [1040]

[ 1050 1060 1070 1080 1090 1100 1110 1120]

[ ........]

Cpol.mrca GCAACATAGA GCAAAAATAG AGGAGTTAAG AGAACATCTA TTGAAATGGG GATTTACCAC ACCAGACAAG AAACATCAGA [1120]

Cpol.LScot A. .G [1120]

Cpol.MMcot G [1120] 

[ 1130 1140 1150 1160 1170 1180 1190 1200] 

[ ........] 

Cpol.mrca AAGAACCCCC ATTTCTTTGG ATGGGGTATG AACTCCATCC TGACAAATGG ACAGTACAGC CTATACAGCT GCCAGAAAAG [1200] 

Cpol.LScot [1200]

Cpol.MMcot [1200] 

[ 1210 1220 1230 1240 1250 1260 1270 1280] 

[ ........] 

Cpol.mrca GATAGCTGGA CTGTCAATGA TATACAGAAG TTAGTGGGAA AATTAAACTG GGCAAGTCAG ATTTACCCAG GGATTAAAGT [1280]

Cpol.LScot [1280]

Cpol.MMcot [1280]

[ 1290 1300 1310 1320 1330 1340 1350 1360] 

[ ........] 

Cpol.mrca AAGGCAACTG TGTAAACTCC TTAGGGGAGC CAAAGCACTA ACAGACATAG TACCACTGAC TGAAGAAGCA GAATTAGAAT [1360] 

Cpol.LScot T G A [1360]

Cpol.MMcot A [1360] 

[ 1370 1380 1390 1400 1410 1420 1430 1440] 

[ ........] 

Cpol.mrca TGGCAGAGAA CAGGGAAATT CTAAAAGAAC CAGTACATGG AGTATATTAT GACCCATCAA AAGACTTAAT AGCTGAAATA [1440] 

Cpol.LScot G [1440]

Cpol.MMcot [1440] 

[ 1450 1460 1470 1480 1490 1500 1510 1520] 

[ ........] 

Cpol.mrca CAGAAACAGG GGCATGACCA ATGGACATAT CAAATTTACC AAGAACCATT CAAAAATCTG AAAACAGGAA AGTATGCAAA [1520]

Cpol.LScot G [1520]

Cpol.MMcot G [1520] 

[ 1530 1540 1550 1560 1570 1580 1590 1600] 

[ ........] 

Cpol.mrca AATGAGGTCT GCCCACACTA ATGATGTAAA ACAATTAACA GAAGCAGTGC AAAAAATAGC CATGGAAAGC ATAGTAATAT [1600] 

Cpol.LScot A G G [1600]

Cpol.MMcot A G G [1600] 

[ 1610 1620 1630 1640 1650 1660 1670 1680] 

[ ........]

Cpol.mrca GGGGAAAGAC TCCTAAATTT AGACTACCCA TCCAAAAAGA AACATGGGAG ACATGGTGGA CAGACTATTG GCAAGCCACC [1680]

Cpol.LScot [1680]

Cpol.MMcot T G G [1680] 

[ 1690 1700 1710 1720 1730 1740 1750 1760] 

[ ........]

Cpol.mrca TGGATTCCTG AGTGGGAGTT TGTTAATACC CCTCCCCTAG TAAAATTATG GTACCAGCTA GAAAAAGAAC CCATAGCAGG [1760]

Cpol.LScot G . .G [1760]

Cpol.MMcot G . .G [1760] 



[ 1770 1780 1790 1800 1810 1820 1830 1840] 

[ ........] 

Cpol.mrca AGCAGAAACT TTCTATGTAG ATGGGGCAGC TAATAGGGAA ACTAAACTAG GAAAAGCAGG GTATGTTACT GACAAAGGAA [1840] 

Cpol.LScot A A G [1840] 

Cpol.MMcot A A G [1840] 

[ 1850 1860 1870 1880 1890 1900 1910 1920] 

[ ........] 

Cpol.mrca GACAGAAAGT TGTTTCTCTA ACTGAAACAA CAAATCAGAA GACTGAATTA CAAGCAATTC AGCTAGCTTT GCAGGATTCA [1920] 

Cpol.LScot .G A A [1920] 

Cpol.MMcot .G A [1920] 

[ 1930 1940 1950 1960 1970 1980 1990 2000] 

[ ........] 

Cpol.mrca GGATCAGAAG TAAACATAGT AACAGACTCA CAATATGCAT TAGGAATCAT TCAAGCACAA CCAGATAAGA GTGAATCAGA [2000] 

Cpol.LScot G [2000] 

Cpol.MMcot G [2000] 

[ 2010 2020 2030 2040 2050 2060 2070 2080] 

[ ........] 

Cpol.mrca GTTAGTCAAT CAAATAATAG AGCAGTTAAT AAAAAAGGAA AAGGTCTACC TGTCATGGGT ACCAGCACAT AAAGGAATTG [2080]

Cpol.LScot C A. .A G [2080] 

Cpol.MMcot A G [2080] 

[ 2090 2100 2110 2120 2130 2140 2150 2160] 

[ ........] 

Cpol.mrca GAGGAAATGA ACAAGTAGAT AAATTAGTAA GTTCTGGAAT CAGGAAAGTG CTGTTTCTAG ATGGAATAGA TAAAGCTCAA [2160] 

Cpol.LScot AG G [2160] 

Cpol.MMcot AG G [2160] 

[ 2170 2180 2190 2200 2210 2220 2230 2240] 

[ ........] 

Cpol.mrca GAAGAACATG AAAAATATCA CAGCAATTGG AGAGCAATGG CTAGTGAGTT TAATCTGCCA CCCATAGTAG CAAAAGAAAT [2240]

Cpol.LScot G G [2240] 

Cpol.MMcot G [2240] 

[ 2250 2260 2270 2280 2290 2300 2310 2320] 

[ ........] 

Cpol.mrca AGTAGCTAGC TGTGATAAAT GTCAGCTAAA AGGGGAAGCC ATGCATGGAC AAGTAGACTG TAGTCCAGGG ATATGGCAAT [2320]

Cpol.LScot A [2320] 

Cpol.MMcot A [2320] 

[ 2330 2340 2350 2360 2370 2380 2390 2400] 

[ ........] 

Cpol.mrca TAGATTGTAC ACATTTAGAA GGAAAAGTTA TCCTGGTAGC AGTCCATGTA GCCAGTGGCT ACATAGAAGC AGAAGTTATC [2400]

Cpol.LScot A.C G [2400] 

Cpol.MMcot A.C G [2400] 

[ 2410 2420 2430 2440 2450 2460 2470 2480] 

[ ........] 

Cpol.mrca CCAGCAGAAA CAGGACAGGA AACAGCATAC TTTATATTAA AATTAGCAGG AAGATGGCCA GTAAAAGTAA TACATACAGA [2480]

Cpol.LScot A C C [2480] 

Cpol.MMcot A C C [2480] 

[ 2490 2500 2510 2520 2530 2540 2550 2560] 

[ ........] 

Cpol.mrca CAATGGCAGC AATTTCACCA GTGCTGCAGT TAAGGCAGCC TGTTGGTGGG CAGGTATCCA ACAGGAATTT GGAATTCCCT [2560]

Cpol.LScot T [2560] 

Cpol.MMcot T A [2560] 

[ 2570 2580 2590 2600 2610 2620 2630 2640] 

[ ........] 

Cpol.mrca ACAATCCCCA AAGTCAGGGA GTAGTAGAAT CCATGAATAA AGAATTAAAG AAAATCATAG GGCAGGTAAG AGATCAAGCT [2640] 

Cpol.LScot [2640] 

Cpol.MMcot [2640] 

[ 2650 2660 2670 2680 2690 2700 2710 2720] 

[ ........] 

Cpol.mrca GAGCACCTTA AGACAGCAGT ACAAATGGCA GTATTCATTC ACAATTTTAA AAGAAAAGGG GGGATTGGGG GGTACAGTGC [2720]

Cpol.LScot [2720] 

Cpol.MMcot [2720] 



[ 2730 2740 2750 2760 2770 2780 2790 2800]

[ ........] 

Cpol.mrca AGGGGAAAGA ATAATAGACA TAATAGCAAC AGACATACAA ACTAAAGAAT TACAAAAACA AATTATAAAA ATTCAAAATT [2800] 

Cpol.LScot [2800] 

Cpol.MMcot [2800] 

[ 2810 2820 2830 2840 2850 2860 2870 2880] 

[ ........] 

Cpol.mrca TTCGGGTTTA TTACAGAGAC AGCAGAGACC CTGTTTGGAA AGGACCAGCC AAACTACTCT GGAAAGGTGA AGGGGCAGTA [2880]

Cpol.LScot A [2880] 

Cpol.MMcot A [2880] 

[ 2890 2900 2910 2920 2930 2940 2950 2960] 

[ ........] 

Cpol.mrca GTAATACAAG ACAATAGTGA CATAAAGGTA GTACCAAGGA GGAAAGCAAA GATCATTAGG GATTATGGAA AACAGATGGC [2960] 

Cpol.LScot T A A. ..C [2960] 

Cpol.MMcot T..C A.. A. ..C [2960] 

[ 2970 2980 2990 3000] 

[ ... .] 

Cpol.mrca AGGTGCTGAT TGTGTGGCAG GTAGACAGGA TGAAGATTAG [3000]

Cpol.LScot [3000] 

Cpol.MMcot [3000] 



Figure 31 

Comparison of Clade C rev Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Crev.mrca ATGGCAGGAA GAAGCGGAGA CAGCGACGAA GCGCTCCTCC AAGCAGTGAG GATCATCAAA ATCCTATATC AAAGCAACCC [80]

Crev.LScot T [80]

Crev.MMcot [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Crev.mrca TTACCCCAAA CCCGAGGGGA CCCGACAGGC TCGAAGGAAT CGAAGAAGAA GGTGGAGAGC AAGACAGAGA CAGATCCATT [160]

Crev.LScot G.A [160]

Crev.MMcot G.A [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Crev.mrca CGATTAGTGA GCGGATTCTT AGCACTTGCC TGGGACGACC TGCGGAGCCT GTGCCTCTTC AGCTACCACC GCTTGAGAGA [240]

Crev.LScot A [240]

Crev.MMcot T A [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Crev.mrca CTTCATCTTG ATTGCAGCGA GGACTGTGGA ACTTCTGGGA CGCAGCAGTC TCAGGGGACT ACAGAGGGGG TGGGAAGCCC [320]

Crev.LScot A... G.GA AG. A [320]

Crev.MMcot A... G.GA AG. A [320]

[ 330 340 350 360 370 380]

[ ......]

Crev.mrca TTAAATATCT GGGAAGCCTT GTGCAGTATT GGGGTCAGGA GCTAAAAAAG AGTGCTATTA G [381]

Crev.LScot . . . .G T T... A [381]

Crev.MMcot . . . .G T T... A [381]



Figure 32 

Comparison of Clade C tat Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Ctat.mrca ATGGAGCCAG TAGATCCTAA CCTAGAGCCC TGGAACCATC CAGGAAGTCA GCCTAAAACT GCTTGTAATA AATGTTATTG [80] 

Ctat.LScot C G [80] 

Ctat.MMcot C G [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Ctat.mrca TAAAAAATGT AGCTATCATT GTCTAGTTTG CTTTCTGACA AAAGGCTTAG GCATTTCCTA TGGCAGGAAG AAGCGGAGAC [160] 

Ctat.LScot ....C.C A [160] 

Ctat.MMcot ....C.C A [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Ctat.mrca AGCGACGAAG AGCTCCTCCA AGCAGTGAGG ATCATCAAAA TCCTATATCA AAGCAACCCT TATCCCAAAC CCGAGGGGAC [240] 

Ctat.LScot C C [240] 

Ctat.MMcot C C [240] 

[ 250 260 270 280 290 300 ] 

[ ......] 

Ctat.mrca CCGACAGGCT CGGAGGAATC GAAGAAGAAG GTGGAGAGCA AGACAGAGAC AGATCCGTGC GATTAG [306] 

Ctat.LScot A A.T [306] 

Ctat.MMcot A.T [306] 



Figure 33 

Comparison of Clade C vif Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Cvif.mrca ATGGAAAACA GATGGCAGGT GCTGATTGTG TGGCAGGTAG ACAGGATGAA GATTAGAACA TGGAATAGTT TAGTAAAACA [80]

Cvif.LScot G.. [80]

Cvif.MMcot G. . [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cvif.mrca CCATATGTAT GTTTCAAGGA GAGCTAAAGG ATGGTTTTAT AGACATCACT ATGAAAGCAG ACATCCAAAA ATAAGTTCAG [160]

Cvif.LScot T C T G [160]

Cvif.MMcot T C T G [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cvif.mrca AAGTACACAT CCCATTAGGG GATGCTAGAT TAGTAATAAA AACATATTGG GGTTTGCATA CAGGAGAAAG AGATTGGCAT [240]

Cvif.LScot A [240]

Cvif.MMcot [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cvif.mrca TTGGGTCATG GAGTCTCCAT AGAATGGAGA CTGAGAAGAT ATAGCACACA AGTAGACCCT GGCCTGGCAG ACCAACTAAT [320]

Cvif.LScot T G [320]

Cvif.MMcot T G [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cvif.mrca TCATATGCAT TATTTTGATT GTTTTGCAGA CTCTGCCATA AGGAAAGCCA TATTAGGACA TATAGTTAGC CCTAGGTGTG [400]

Cvif.LScot A C TT [400]

Cvif.MMcot A C TT [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cvif.mrca ACTATCAAGC AGGACATAAC AAGGTAGGAT CTCTACAATA CTTGGCACTG ACAGCATTAA TAAAACCAAA AAAGATAAAG [480]

Cvif.LScot T G [480]

Cvif.MMcot T G [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Cvif.mrca CCACCTCTGC CTAGTGTTAA GAAATTAGTA GAGGATAGAT GGAACAAGCC CCAGAAGACC AGGGGCCACA GAGGGAGCCA [560]

Cvif.LScot G G A. . . [560]

Cvif.MMcot G G A. . . [560]

[ 570 ]

[ ]

Cvif.mrca TACAATGAAT GGACACTAG [579]

Cvif.LScot [579]

Cvif.MMcot [579]



Figure 34 

Comparison of Clade C vpr Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Cvpr.mrca ATGGAACAAG CCCCAGAAGA CCAGGGGCCA CAGAGGGAGC CATACAATGA ATGGACACTA GAGCTTTTAG AGGAACTTAA [80]

Cvpr.LScot G A A C. . [80]

Cvpr.MMcot G A C . [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cvpr.mrca GCAGGAAGCT GTCAGACATT TTCCTAGACC ATGGCTCCAT AGCTTAGGAC AACATATCTA TGAAACCTAT GGGGATACTT [160]

Cvpr.LScot C T [160]

Cvpr.MMcot C [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cvpr.mrca GGGCGGGAGT TGAAGCTATA ATAAGAATTC TGCAACAACT ACTGTTTATT CATTTCAGAA TTGGGTGCCA ACATAGCAGA [240]

Cvpr.LScot . .A. A C C A G [240]

Cvpr.MMcot . .A C G [240]

[ 250 260 270 280 290]

[ .....]

Cvpr.mrca ATAGGCATTA TTCGACAGAG AAGAGCAAGA AATGGAGCCA GTAGATCCTA A [291]

Cvpr.LScot T .G [291]

Cvpr.MMcot G [291]



Figure 35 

Comparison of Clade C vpu Gene Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........] 

Cvpu.mrca ATGTTAGATT TAATAGCAAG AGTAGATTAT AGATTAGGAG TAGGAGCATT GATAGTAGCA CTAATCATAG CAATAGTTGT [80] 

Cvpu.LScot C [80] 

Cvpu.MMcot C [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........] 

Cvpu.mrca GTGGACCATA GTATATATAG AATATAGGAA ATTGGTAAGA CAAAGAAAAA TAGACTGGTT AATTAAAAGA ATTAGGGAAA [160] 

Cvpu.LScot T [160] 

Cvpu.MMcot T [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Cvpu.mrca GAGCAGAAGA CAGTGGCAAT GAGAGTGATG GGGATACAGA GGAATTGTCA ACACTGGTGG ATATGGGGCA TCTTAGGCTT [240]

Cvpu.LScot .G T A [240] 

Cvpu.MMcot G T A A [240] 

[ 250 260] 

[ ..]

Cvpu.mrca TTGGATGTTA ATGATTTGTA A [261] 

Cvpu.LScot [261] 

Cvpu.MMcot [261] 



Figure 36

Comparison of Clade C gag Protein Sequence Reconstructions

[ 10 20 30 40 50 60 70 80]

[ ........]

Cgag.mrca MGARASILRG GKLDTWEKIR LRPGGKKHYM IKHLVWASRE LERFALNPGL LETSEGCKQI IKQLQPALQT GTEELKSLYN [80]

Cgag.LScot L M R [80]

Cgag.MMcot L M R. . . . [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cgag.mrca TVATLYCVHQ RIEVRDTKEA LDKIEEEQNK SQQKTQQAEA -ADGKVSQNY PIVQNLQGQM VHQAISPRTL NAWVKVIEEK [159]

Cgag.LScot E K - [159]

Cgag.MMcot E K A [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cgag.mrca AFSPEVIPMF TALSEGATPQ DLNTMLNTVG GHQAAMQMLK DTINEEAAEW DRLHPVHAGP VAPGQMREPR GSDIAGTTST [239]

Cgag.LScot [239]

Cgag.MMcot [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cgag.mrca LQEQIAWMTS NPPIPVGDIY KRWIILGLNK IVRMYSPVSI LDIKQGPKEP FRDYVDRFFK TLRAEQATQD VKNWMTDTLL [319]

Cgag.LScot V [319]

Cgag.MMcot V [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cgag.mrca VQNANPDCKT ILRALGPGAT LEEMMTACQG VGGPSHKARV LAEAMSQANN TNIMMQRGNF KGPRRIVKCF NCGKEGHIAR [399]

Cgag.LScot G S K [399]

Cgag.MMcot G S K [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cgag.mrca NCRAPRKKGC WKCGKEGHQM KDCTERQANF LGKIWPSHKG RPGNFLQSRP EPTAPPAESF RFEETTPAPK QEPKDREPLT [479]

Cgag.LScot [479]

Cgag.MMcot [480]

[ 490 ]

[ ]

Cgag.mrca SLKSLFGSDP LSQ [492]

Cgag.LScot [492]

Cgag.MMcot [493]



Figure 37 

Comparison of Clade C gp160 Protein Sequence Reconstructions

[ 10 20 30 40 50 60 70 80]

[ ........]

Cgp160.mrca MRVMGIQRNC QQWWIWGILG FWMLMICSVV GNLWVTVYYG VPVWKEAKTT LFCASDAKAY EREVHNVWAT HACVPTDPNP [80]

Cgp160.LScot . ..R..L N K [80]

Cgp160.MMcot . ..R..L N K [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cgp160.mrca QEMVLENVTE NFNMWKNDMV DQMHEDIISL WDQSLKPCVK LTPLCVTLNC TNVNNTNNTN STMNGEMKNC SFNITTEIRD [160]

Cgp160.LScot S...A..T.. N..K..I A [160]

Cgp160.MMcot S...T..T.. N..K..I V...L.. [160]

[ 170 180 190 200 210 220 230 240]

[ ........]

Cgp160.mrca KKKKEYALFY RLDIVPLNEN NNNTSEYRLI NCNTSAITQA CPKVSFDPIP IHYCAPAGYA ILKCNNKTFN GTGPCKNVST [240]

Cgp160.LScot ..Q.V S.S N. . . - [240]

Cgp160.MMcot S N . . . . [240]

[ 250 260 270 280 290 300 310 320]

[ ........]

Cgp160.mrca VQCTHGIKPV VSTQLLLNGS LAEEEIIIRS ENLTNNAKTI IVQLNESVEI VCTRPNNNTR KSMRIGPGQT FYATGDIIGD [320]

Cgp160.LScot V H I [320]

Cgp160.MMcot H I [320]

[ 330 340 350 360 370 380 390 400]

[ ........]

Cgp160.mrca IRQAHCNISG REWNNTLQQV AEKLRKHFPN KTIKFAPSSG GDLEITTHSF NCRGEFFYCN TSKLFNSTYN STNSTNSTIT [400]

Cgp160.LScot E E...K...R. GK..EE E G T... [400]

Cgp160.MMcot E E...K...R. GK..EE E G [400]

[ 410 420 430 440 450 460 470 480]

[ ........]

Cgp160.mrca LPCRIKQIIN MWQGVGQAMY APPIAGNITC KSNITGLLLT RDGGKNETNE TETFRPGGGD MRDNWRSELY KYKVVEIKPL [480]

Cgp160.LScot E..R V N. .N ..I [480]

Cgp160.MMcot .Q E..R V D..D ..I [480]

[ 490 500 510 520 530 540 550 560]

[ ........]

Cgp160.mrca GVAPTKAKRR VVEREKRAVG LGAVFLGFLG AAGSTMGAAS ITLTVQARQL LSGIVQQQSN LLRAIEAQQH MLQLTVWGIK [560]

Cgp160.LScot .I I [560]

Cgp160.MMcot .I I [560]

[ 570 580 590 600 610 620 630 640]

[ ........]

Cgp160.mrca QLQARVLAME RYLKDQQLLG IWGCSGKLIC TTAVPWNSSW SNKSQDDIWD NMTWMEWDRE INNYTDTIYR LLEESQNQQE [640]

Cgp160.LScot . . .T. . . .I E Q S D [640]

Cgp160.MMcot - . .T. . . .I E Q S D [640]

[ 650 660 670 680 690 700 710 720]

[ ........]

Cgp160.mrca KNEQDLLALD SWENLWNWFD ISNWLWYIKI FIMIVGGLIG LRIIFAVLSI VNRVRQGYSP LSFQTLTPNP RGPDRLERIE [720]

Cgp160.LScot Q. .K K T G. . . [720]

Cgp160.MMcot Q. .K K T G. . . [720]

[ 730 740 750 760 770 780 790 800]

[ ........]

Cgp160.mrca EEGGEQDRDR SIRLVSGFLA LAWDDLRSLC LFSYHRLRDF ILIAARTVEL LGRSSLRGLQ RGWEALKYLG SLVQYWGQEL [800]

Cgp160.LScot V. . .A L. . [800]

Cgp160.MMcot V...A L. . [800]

[ 810 820 830 840 ]

[ ]

Cgp160.mrca KKSAISLLDT IAIAVAEGTD RIIEVVQRAC RAILNIPRRI RQGFEAALQ [849]

Cgp160.LScot LI.. I. ...R [849]

Cgp160.MMcot LI . . I . ...R [849]



Figure 38

Comparison of Clade C nef Protein Sequence Reconstructions 



10 20 30 40 50 60 70 80] 

.] 

MGGKWSKSSI VGWPAVRERI RRTAPAAEGV GAASQDLDKH GALTSSNTAA TNADCAWLEA QEEE-EVGFP VRPQVPLRPM [79] 

E N E [80] 

E N E [80] 

90 100 110 120 130 140 150 160] 

.] 

TYKGAVDLSF FLKEKGGLEG LIYSKKRQEI LDLWVYHTQG YFPDWQNYTP GPGIRFPLTF GWCFKLVPVD PREVEEANEG [159] 

F V.Y [160] 

F V.Y [160] 

170 180 190 200 ] 

] 

ENNCLLHPMS QHGMEDEDRE VLKWKFDSHL ARRHMARELH PEYYKDC [206]
[207] 






Figure 39

Comparison of Clade C pol Protein Sequence Reconstructions 

[ 10 20 30 40 50 60 70 80] 

[ ........] 

Cpol.mrca FFRENLAFPQ GEAREFPSEQ TRANSPTSRE LQVRGDNPRS EAGAERQGTL NFPQITLWQR PLVSIKVGGQ IKEALLDTGA [80]

Cpol.LScot [80] 

Cpol.MMcot L T L [80] 

[ 90 100 110 120 130 140 150 160] 

[ ........]

Cpol.mrca DDTVLEDINL PGKWKPKMIG GIGGFIKVRQ YDQILIEICG KKAIGTVLVG PTPVNIIGRN MLTQLGCTLN FPISPIETVP [160]

Cpol.LScot E [160]

Cpol.MMcot E [160] 

[ 170 180 190 200 210 220 230 240] 

[ ........] 

Cpol.mrca VKLKPGMDGP KVKQWPLTEE KIKALTAICE EMEKEGKITK IGPENPYNTP VFAIKKKDST KWRKLVDFRE LNKRTQDFWE [240]

Cpol.LScot [240] 

Cpol.MMcot [240] 

[ 250 260 270 280 290 300 310 320] 

[ ........] 

Cpol.mrca VQLGIPHPAG LKKKKSVTVL DVGDAYFSVP LDEDFRKYTA FTIPSINNET PGIRYQYNVL PQGWKGSPAI FQSSMTKILE [320] 

Cpol.LScot G [320] 

Cpol.MMcot G [320] 

[ 330 340 350 360 370 380 390 400] 

[ ........] 

Cpol.mrca PFRAQNPEIV IYQYMDDLYV GSDLEIGQHR AKIEELREHL LKWGFTTPDK KHQKEPPFLW MGYELHPDKW TVQPIQLPEK [400] 

Cpol.LScot [400] 

Cpol.MMcot [400]

[ 410 420 430 440 450 460 470 480] 

[ ........] 

Cpol.mrca DSWTVNDIQK LVGKLNWASQ IYPGIKVRQL CKLLRGAKAL TDIVPLTEEA ELELAENREI LKEPVHGVYY DPSKDLIAEI [480]

Cpol.LScot [480]

Cpol.MMcot [480]

[ 490 500 510 520 530 540 550 560] 

[ ........] 

Cpol.mrca QKQGHDQWTY QIYQEPFKNL KTGKYAKMRS AHTNDVKQLT EAVQKIAMES IVIWGKTPKF RLPIQKETWE TWWTDYWQAT [560] 

Cpol.LScot T [560] 

Cpol.MMcot T A [560] 

[ 570 580 590 600 610 620 630 640] 

[ ........] 

Cpol.mrca WIPEWEFVNT PPLVKLWYQL EKEPIAGAET FYVDGAANRE TKLGKAGYVT DKGRQKVVSL TETTNQKTEL QAIQLALQDS [640]

Cpol.LScot I R....I [640]

Cpol.MMcot I R....I [640]

[ 650 660 670 680 690 700 710 720] 

[ ........] 

Cpol.mrca GSEVNIVTDS QYALGIIQAQ PDKSESELVN QIIEQLIKKE KVYLSWVPAH KGIGGNEQVD KLVSSGIRKV LFLDGIDKAQ [720] 

Cpol.LScot R [720]

Cpol.MMcot R [720]

[ 730 740 750 760 770 780 790 800] 

[ ........] 

Cpol.mrca EEHEKYHSNW RAMASEFNLP PIVAKEIVAS CDKCQLKGEA MHGQVDCSPG IWQLDCTHLE GKVILVAVHV ASGYIEAEVI [800] 

Cpol.LScot I I [800] 

Cpol.MMcot I I [800] 

[ 810 820 830 840 850 860 870 880] 

[ ........] 

Cpol.mrca PAETGQETAY FILKLAGRWP VKVIHTDNGS NFTSAAVKAA CWWAGIQQEF GIPYNPQSQG VVESMNKELK KIIGQVRDQA [880]

Cpol.LScot [880] 

Cpol.MMcot [880] 



[ 890 900 910 920 930 940 950 960] 

[ ........] 

Cpol.mrca EHLKTAVQMA VFIHNFKRKG GIGGYSAGER IIDIIATDIQ TKELQKQIIK IQNFRVYYRD SRDPVWKGPA KLLWKGEGAV [960] 

Cpol.LScot I [960] 

Cpol.MMcot I [960] 

[ 970 980 990 ]

[ ] 

Cpol.mrca VIQDNSDIKV VPRRKAKIIR DYGKQMAGAD CVAGRQDED [999]

Cpol.LScot K [999] 

Cpol.MMcot K [999] 



Figure 40 

Comparison of Clade C rev Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........]

Crev.mrca MAGRSGDSDE ALLQAVRIIK ILYQSNPYPK PEGTRQARRN RRRRWRARQR QIHSISERIL STCLGRPAEP VPLQLPPLER [80]

Crev.LScot K I.. [80]

Crev.MMcot K F I.. [80]

[ 90 100 ]

[ ]

Crev.mrca LHLDCSEDCG TSGTQQSQGT TEGVGSP [107]

Crev.LScot . .IGD..SS [107]

Crev.MMcot . . IGD . .SS [107]



Figure 41 

Comparison of Clade C tat Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80]

[ ........] 

Ctat.mrca MEPVDPNLEP WNHPGSQPKT ACNKCYCKKC SYHCLVCFLT KGLGISYGRK KRRQRRRAPP SSEDHQNPIS KQPLSQTRGD [80] 

Ctat.LScot P H Q S P [80] 

Ctat.MMcot P H Q S P [80] 

[ 90 100] 

[ . ] 

Ctat.mrca PTGSEESKKK VESKTETDPC D [101] 

Ctat.LScot F . [101] 

Ctat.MMcot F . [101] 






Figure 42

Comparison of Clade C vif Protein Sequence Reconstructions

[ 10 20 30 40 50 60 70 80]

[ ........]

Cvif.mrca MENRWQVLIV WQVDRMKIRT WNSLVKHHMY VSRRAKGWFY RHHYESRHPK ISSEVHIPLG DARLVIKTYW GLHTGERDWH [80]

Cvif.LScot N V Q [80]

Cvif.MMcot N V [80]

[ 90 100 110 120 130 140 150 160]

[ ........]

Cvif.mrca LGHGVSIEWR LRRYSTQVDP GLADQLIHMH YFDCFADSAI RKAILGHIVS PRCDYQAGHN KVGSLQYLAL TALIKPKKIK [160]

Cvif.LScot I [160]

Cvif.MMcot I [160]

[ 170 180 190 ]

[ ]

Cvif.mrca PPLPSVKKLV EDRWNKPQKT RGHRGSHTMN GH [192]

Cvif.LScot R R . . N [192]

Cvif.MMcot R R..N [192]



Figure 43 

Comparison of Clade C vpr Protein Sequence Reconstructions 

[ 10 20 30 40 50 60 70 80]

[ ........]

Cvpr.mrca MEQAPEDQGP QREPYNEWTL ELLEELKQEA VRHFPRPWLH SLGQHIYETY GDTWAGVEAI IRILQQLLFI HFRIGCQHSR [80]

Cvpr.LScot I Y T L [80]

Cvpr.MMcot T . . . . L [80]

[ 90 ]

[ ]

Cvpr.mrca IGIIRQRRAR NGASRS [96]

Cvpr.LScot . . . L [96]

Cvpr.MMcot . . . M [96]



Figure 44 

Comparison of Clade C vpu Protein Sequence Reconstructions 



[ 10 20 30 40 50 60 70 80] 

[ ........]

Cvpu.mrca MLDLIARVDY RLGVGALIVA LIIAIVVWTI VYIEYRKLVR QRKIDWLIKR IRERAEDSGN ESDGDTEELS TLVDMGHLRL [80]

Cvpu.LScot . . . . L L E M [80] 

Cvpu.MMcot . . . . L L E M [80] 

[ ] 
[ ] 

Cvpu.mrca LDVNDL [86]

Cvpu.LScot [86] 

Cvpu.MMcot [86] 



Figure 45A

Figure 45B

Taxon labels recovered from the figure, top to bottom (tree graphic not reproduced):

87USSG3X
AUC18
AUC18MBC
AUMBCC18B
AUMBCC54
AUMBCD36
AUMBCC98
AUMBC200
HIVBH102
HIVF12CG
HIVLAICG
HIVMCK1
HIVPV22
HIVNL43
HIVJC16
89SP061
US89.6
HIVJRCSF
HIVJRFL
HIVMN
AUMBC925
HIVYU10X
HIVYU2X
USDH123
88USWR27
CNRL42CG
USAD8
D31
MANC
HIVOYI
HIVSF2CG
HLACH320B
HIVHAH2
NLACH320A
HIVCAM1
HIVNY5CG
HIVRF
HIVWEAU160
HIVELICG
HIV22Z6
HIVNDK

Figure 46. Deduced ancestor protein sequences
A. 

SIVBK28 ancestor (Env segment) 

hnCSETDRWGLTKShffiTSSCIAQNNCTGLEQEQMISCKFNMTGLKIUDKTKEYNETW 

YSTDLVCEQGNSTDNESRCYlVnfflCNTSVIQESCDKHYWDTIRFRYCAPPGYALLRC 

NDTNYSGFMPKCSKVWSSCTRMMETQTSTWFGFNGTRAENRTYIYWHGRDNRTII 

SLNKYYNLTMKCRRPGNKTVLPVTIMSGLVFHSQPINDRPKQAWCWFGGKWKDAI 

KEVKQTIVKHPRYTGTNNTDKINLTAPGGGDPEVTFMWTNCRGEFLYCKMNWFLN 

WVEDRDVTTQRPKERHRRNYWCHIRQIINTWHKVGKNVYLPPREGDLTCNSTVTS 

LIANTDWTDGNQTNITMSAEVA 

B. 

ANl-EnvB 

MRVKGIRKNYQHLWRWGTMLLGMLMICSAAEKLWVTVYYGVPVWKEATTTLFC 

ASDAKAYDTEVHm^WATHACWTDPNPQEVVLENVTENFlSIMWKNNMVEQMH 

IISLWDQSLKPCVKLTPLCVTLNCTDDLRTNATNTTNSSATTNTTSSGGGTMEGEKG 

EIKNCSFNVTTSIRDKJVIQKEYALFYKiDVVPIDNDNNNTNNNTSYRLINCNTSVITQ 

ACPKVSFEPIPfflYCTPAGFAILKCNDKKFNGTGPCTNVSTVQCTHGIRPVVSTQLLL 

NGSLAEEEWIRSENFTDNAKTirVQLNESVEINCTRPNNNTRKSIPIGPGRALYATGK 

IIGDIRQAHCNLSRAKWNNTLKQIVTKLREQFGNNKTTIVFNQSSGGDPEIVMHSFN 

CGGEFFYCNSTQLFNSTWHFNGTWGNNNTERSNNAADDNDTITLPCRIKQIINMWQ 

EVGKAMYAPPISGQIRCSSMTGLLLTRDGGNNENTNNTDTEIFRPGGGDMRDNWRS 

ELYKYKVVKIEPLGVAPTKAKRRWQREKSAVGMLGAMFLGFLGAAGSTMGAAS 

MTLTVQARQLLSGrVQQQNNLLRAffiAQQHLLQLTVWGIKQLQARVLAVERYLKD 

QQLLGrWGCSGKLICTTAWWNASWSNKSLDKIWNNMTWMEWEREroNYTGLIYT 

LffiESQNQQEKhffiQELLELDKWASLWNWFDITNWLWYIKIFIMrVGGLVGLRrVFAV 

LSrVNRVRQGYSPLSFQTHLPAPRGPDRPEGIEEEGGERDRDRSGRLVNGFLALRVD 

DLRSLCLFSYHRLSDLLLIVARJVELLGRRGWEALKYWWNLLQYWSQELKNSAVSL 

LNATAIAVAEGTDRVIEWQRACRAILHIPRRIRQGLERALL 



